Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome
نویسندگان
چکیده
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance.
منابع مشابه
Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species
Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to D genome, which is an important wild bio-source for the cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information are a versatile tool for species identification and phylogenetic implications in plants. Different ch...
متن کاملTranscriptome Sequencing of Guilan Native Cow in Comparison with bosTau4 Reference Genome
RNA-sequencing is a new method of transcriptome characterization of organisms. Based on identity and relatedness, there are large genetic variations among different cattle breeds. The goal of the current study was to sequence the transcriptome of Guilan native cow and compare with available reference genome using RNA-sequencing method. Blood samples were collected from 14 Guilan native cows and...
متن کاملA comparative phylogenetic analysis of Theileria spp. by using two two "18S ribosomal RNA" and "Theileria annulata merozoite surface antigen" gene sequences
More than 185 species, strains and unclassified Theileria parasites are categorized in the Entrez Taxonomy. The accurate diagnosis and proper identification of the causative agents are important for understanding the epidemiology, prevention and appropriate treatment. This study aims to discuss the importance of two genes of Theileria annulata 18S ribosomal RNA (18S rRNA) and Theileria annulata...
متن کاملPseudogenes
Pseudogenes are ubiquitous and abundant in genomes. Pseudogenes were once called "genomic fossils" and treated as "junk DNA" several years. Nevertheless, it has been recognized that some pseudogenes play essential roles in gene regulation of their parent genes, and many pseudogenes are transcribed into RNA. Pseudogene transcripts may also form small interfering RNA or decrease cellular miRNA co...
متن کاملAn Enterovirus-Like RNA Construct for Colon Cancer Suicide Gene Therapy
Background: In gene therapy, the use of RNA molecules as therapeutic agents has shown advantages over plasmid DNA, including higher levels of safety. However, transient nature of RNA has been a major obstacle in application of RNA in gene therapy. Methods: Here, we used the internal ribosomal entry site of encephalomyocarditis virus and the 3’ non-translated region of Poliovirus to design an en...
متن کامل